Recycling Randomness with Structure for Sublinear time Kernel Expansions
Abstract
We propose a scheme for recycling Gaussian random vectors into structured matrices to approximate various kernel functions in sublinear time via random embeddings. Our framework includes the Fastfood construction of Le et al. (2013) as a special case, but also extends to Circulant, Toeplitz and Hankel matrices, and the broader family of structured matrices that are characterized by the concept of low-displacement rank. We introduce notions of coherence and graph-theoretic structural constants that control the approximation quality, and prove unbiasedness and low-variance properties of random feature maps that arise within our framework. For the case of low-displacement matrices, we show how the degree of structure and randomness can be controlled to reduce statistical variance at the cost of increased computation and storage requirements. Empirical results strongly support our theory and justify the use of a broader family of structured matrices for scaling up kernel methods using random features.
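As a rough illustration of the idea in the abstract, the sketch below recycles a single Gaussian vector into a circulant projection matrix and uses it to build random Fourier features for a Gaussian kernel. It is not the authors' construction: the function names `circulant_project` and `circulant_features`, the random sign-flip vector `signs`, the bandwidth `sigma`, and the choice of the Gaussian kernel are all assumptions made for the example. The circulant product is evaluated with the FFT in O(n log n) time rather than O(n^2).

```python
import numpy as np

def circulant_project(X, g, signs):
    """Apply circ(g) to each sign-flipped row of X using the FFT.

    circ(g)[i, j] = g[(i - j) mod n], so the product is a circular
    convolution and costs O(n log n) per row instead of O(n^2).
    """
    spectrum = np.fft.fft(g) * np.fft.fft(X * signs, axis=1)
    return np.fft.ifft(spectrum, axis=1).real

def circulant_features(X, g, signs, sigma=1.0):
    """Hypothetical random Fourier feature map for the Gaussian kernel."""
    n = X.shape[1]
    proj = circulant_project(X, g, signs) / sigma
    # Pairing cos and sin gives z(x) . z(y) = mean_i cos(w_i . (x - y)).
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(n)

rng = np.random.default_rng(0)
n = 256
g = rng.standard_normal(n)               # the one recycled Gaussian vector
signs = rng.choice([-1.0, 1.0], size=n)  # assumed random diagonal sign flips

x = rng.standard_normal(n) / np.sqrt(n)
y = x + 0.5 * rng.standard_normal(n) / np.sqrt(n)

Z = circulant_features(np.stack([x, y]), g, signs)
approx = Z[0] @ Z[1]                            # feature-space estimate
exact = np.exp(-np.sum((x - y) ** 2) / 2.0)     # Gaussian kernel, sigma = 1
print(f"approx={approx:.4f} exact={exact:.4f}")
```

Every row of the circulant matrix is a permutation of `g`, so each row is marginally a standard Gaussian vector and each feature is an unbiased estimate of the kernel value. The rows are dependent, however, which is the source of the extra variance that the paper's coherence and structural constants are meant to control.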
Related resources
Online learning of positive and negative prototypes with explanations based on kernel expansion
The issue of classification is still a topic of discussion in many current articles. Most of the models presented in the articles suffer from a lack of explanation for a reason comprehensible to humans. One way to create explainability is to separate the weights of the network into positive and negative parts based on the prototype. The positive part represents the weights of the correct class ...
Sampling Techniques for Kernel Methods
We propose randomized techniques for speeding up Kernel Principal Component Analysis on three levels: sampling and quantization of the Gram matrix in training, randomized rounding in evaluating the kernel expansions, and random projections in evaluating the kernel itself. In all three cases, we give sharp bounds on the accuracy of the obtained approximations. Rather intriguingly, all three tech...
Randomized Algorithm for Approximate Nearest Neighbor Search in High Dimensions
A randomized algorithm employs a degree of randomness as part of its logic with uniform random bits as an auxiliary input to guide its behavior, in the hope of achieving good average runtime performance over all possible choices of the random bits. In this paper, we formulate a randomized algorithm capable of finding approximate nearest neighbors, specifically in high dimensional datasets. The ...
FPGA Implementation of Point Multiplication on Koblitz Curves Using Kleinian Integers
We describe algorithms for point multiplication on Koblitz curves using multiple-base expansions of the form $k = \sum \pm \tau^a (\tau - 1)^b$ and $k = \sum \pm \tau^a (\tau - 1)^b (\tau^2 - \tau - 1)^c$. We prove that the number of terms in the second type is sublinear in the bit length of k, which leads to the first provably sublinear point multiplication algorithm on Koblitz curves. For the first type, we conjecture that the number of terms ...
On the Randomness of Pi and Other Decimal Expansions
Tests of randomness much more rigorous than the usual frequency-of-digit counts are applied to the decimal expansions of π, e and √2, using the Diehard Battery of Tests adapted to base 10 rather than the original base 2. The first 10^9 digits of π, e and √2 seem to pass the Diehard tests very well. But so do the decimal expansions of most rationals k/p with large primes p. Over the entire set o...